Package Overview
History of Plotly
Plotly is a company founded by Alex Johnson, Jack Parmer, Chris Parmer, and Matthew Sundquist in 2013. While working for the data science program of a California-based cleantech company, Alex, Jack, Chris, and Matthew faced the seemingly simple problem of sharing the data they had gathered in a meaningful and easy way. Even after collecting, analyzing, and sorting data, they felt that important questions remained unanswered. These questions included:
- How do we share what we’ve learned with others in a meaningful way?
- How do we enable others to explore our data?
- Can we give others access to the models to explore on their own?
To answer these questions, they decided to create a tool that makes scientific and data analysis simple. Plotly was originally a JavaScript graphing library; it was later made available as an R package that allows the creation of graphs using R data and syntax from ggplot2. Their goals for plotly were to use the web as a data science platform, power discovery with open source, provide unlimited flexibility, remove language as a barrier, and enable shared goals across the organization. In 2013 and 2014, Jack, Chris, Matthew, and Alex officially founded “plotly” and opened their Montreal headquarters.
Background of Plotly
Plotly is a graphing library that makes interactive, publication-quality graphs. In data visualization, it is often hard to make graphs that are both interesting and informative. plotly makes this easier by supporting a wide variety of visualizations, ranging from basic bar charts to maps, 3D charts, and animated charts, with a few lines of code. In addition to being easy to use, plotly is open source and is available for numerous languages, including R, Python, .NET, JavaScript, and Julia.
Version History
Current Version: 4.9.3 (last updated 1/10/2021)
Dependencies
R (≥ 3.2.0), ggplot2 (≥ 3.0.0)
Usage
plotly offers a wide variety of interactive options for web-based graphics.
Examples of Usage
plotly converts plots to an interactive, web-based version. The interactive features include:
- zooming in or out of a graph
- selection of data points upon mouse hover
- panning through a graph
- downloading a graph as a PNG
- box selecting or lasso selecting data points
- option to compare nearby data points upon hover
Visual formats supported by plotly include:
- basic charts (scatter and line plots, bar charts, pie charts, bubble charts, and more)
- statistical charts (histograms, 2D histograms, box plots, error bars, violin plots, and more)
- scientific charts (log plots, contour plots, heatmaps, network graphs, ternary contour plots, and more)
- financial charts (time series, candlestick charts, OHLC charts, waterfall charts, funnel charts, and more)
- maps (choropleth maps, scatter plots on maps, mapbox density, lines on maps, mapbox layers, and more)
- 3D charts (3D scatter plots, 3D line plots, 3D surface plots, 3D mesh plots, 3D cone plots, and more)
- subplots (multiple axes, map subplots and small multiples, inset plots, subplots, 3D subplots, and more)
- animated plots
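As a rough illustration of how the chart families above are selected, the sketch below switches to a statistical chart by changing the `type` argument of `plot_ly`. It assumes the plotly package is installed; the variable name `box_fig` and the choice of columns are only for illustration.

```r
library(plotly)

# Box plot of miles per gallon, with one box per cylinder count.
# factor(cyl) makes plotly treat the cylinder count as a category.
box_fig <- plot_ly(
  data = mtcars,
  y = ~mpg,
  color = ~factor(cyl),
  type = "box"
)

# Add a title and a y-axis label
box_fig <- box_fig %>%
  layout(title = "Miles Per Gallon by Number of Cylinders",
         yaxis = list(title = "Miles Per Gallon"))

box_fig
```

The same pattern (pick the data, map columns with `~`, set `type`) carries over to the other chart types, although each type has its own required arguments.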
Bubble Chart
One of the most basic charts you can create in plotly is the bubble chart. A bubble chart is a scatter plot whose markers can vary in color and size. In this example, we build a bubble chart from scratch using plotly and the built-in mtcars dataset in R.
library(plotly) #loads plotly (assumes the package is installed)
#plot_ly function allows for creation of any plotly graph
figure <- plot_ly(
  data = mtcars, #specified dataset, in this case we are using mtcars
  x = ~mpg, #variable that corresponds with x-axis
  y = ~hp, #variable that corresponds with y-axis
  text = rownames(mtcars), #specifies text that shows up when you hover over a data point
  type = 'scatter', #specified graph type, in this case we are using the scatter plot
  mode = 'markers', #draws points only, with no connecting lines
  marker = list(size = ~cyl, opacity = 0.5, color = 'green')) #marker parameter specifies the size, opacity, and color of each of our bubbles
figure <- figure %>% #the pipe operator (%>%) comes from magrittr and is re-exported by plotly; here it passes the figure to layout() to set the title and axis labels
  layout(title = 'Gas Efficiency and Horsepower of Cars in mtcars dataset',
         xaxis = list(title = 'Miles Per Gallon'),
         yaxis = list(title = 'Horsepower'))
figure
You can also customize what is shown on hover by setting the ‘hoverinfo’ parameter to 'text' and building a custom ‘text’ string.
figure <- plot_ly(
  data = mtcars,
  x = ~mpg,
  y = ~hp,
  hoverinfo = 'text', #hoverinfo parameter added: show only the custom text on hover
  text = ~paste('Car Model:', rownames(mtcars), '<br>Displacement:', disp), #text parameter changed; <br> starts a new line in the hover label
  type = 'scatter',
  mode = 'markers',
  marker = list(size = ~cyl, opacity = 0.5, color = 'green'))
figure <- figure %>%
  layout(title = 'Gas Efficiency and Horsepower of Cars in mtcars dataset',
         xaxis = list(title = 'Miles Per Gallon'),
         yaxis = list(title = 'Horsepower'))
figure
Heatmaps
Heatmaps are graphs that use a color scale to illustrate relationships between variables. To create a heatmap in plotly, first convert your dataset into a matrix, which can be done with the as.matrix function in R. It is then important to normalize the variables: the heatmap uses one color scale for all of them, so variables with much smaller or larger values than the rest would otherwise not be distinguishable on the color scale. In this example, we will use the mtcars dataset again.
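To see why normalization matters here, the following sketch compares the spread of each mtcars column before and after dividing by the column mean, which is the same normalization used in this example. It uses base R only; the names `raw_ranges`, `normalized`, and `norm_ranges` are illustrative.

```r
# Spread (max - min) of each raw column: values differ by orders of magnitude,
# so one shared color scale would be dominated by the largest columns
raw_ranges <- sapply(mtcars, function(x) diff(range(x)))

# Divide each column by its own mean so every variable is centered around 1
normalized <- apply(as.matrix(mtcars), 2, function(x) x / mean(x))
norm_ranges <- apply(normalized, 2, function(x) diff(range(x)))

# Compare a large-valued column (disp) with a small-valued one (wt)
round(raw_ranges[c("disp", "wt")], 2)
round(norm_ranges[c("disp", "wt")], 2)
```

Dividing by the mean is only one possible choice; other rescalings, such as z-scores computed with `scale()`, would serve the same purpose of putting the columns on a comparable scale.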
mtcars_matrix <- as.matrix(mtcars) #turning mtcars dataset into a matrix
mtcars_matrix <- apply(mtcars_matrix, 2, function(x) {x/mean(x)}) #normalizing each variable (column) by dividing by its mean
heatmap_fig <- plot_ly(x = colnames(mtcars_matrix), #specifies variables on x-axis
  y = rownames(mtcars_matrix), #specifies car models on y-axis
  z = mtcars_matrix, #specifies data being used
  type = "heatmap") %>% #specifies type of graph
  layout(margin = list(l = 120)) #specifies the left margin of the graph (room for the row labels)
heatmap_fig
3D Graphs
Plotly can also be used to create 3D versions of graphs, such as 3D scatterplots. The following example from the Clustering Lab builds a 3D plotly scatterplot of NBA players based on minutes played, field goals, and points. Hover over each point to see the player name and salary.
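The NBA example that follows reads CSV files that may not be available on your machine. As a self-contained alternative, here is a minimal sketch of the same idea using the built-in mtcars dataset. The choice of columns and the name `fig3d` are assumptions made for illustration, not part of the Clustering Lab.

```r
library(plotly)

# 3D scatterplot: weight, horsepower, and fuel efficiency of each car
fig3d <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~hp,
  z = ~mpg,
  text = rownames(mtcars),   # car model shown on hover
  type = "scatter3d",
  mode = "markers",
  marker = list(size = 4)
)

# In 3D plots the axes live inside the 'scene' layout attribute
fig3d <- fig3d %>%
  layout(title = "mtcars: Weight, Horsepower, and Miles Per Gallon",
         scene = list(xaxis = list(title = "Weight (1000 lbs)"),
                      yaxis = list(title = "Horsepower"),
                      zaxis = list(title = "Miles Per Gallon")))

fig3d
```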
library(tidyverse) #provides read_csv, mutate, distinct, and view
library(plotly)
#importing dataset and viewing data
nba <- read_csv("/cloud/project/nba2020-21.csv")
#a function for pre-processing of the nba data
nba_pre_processing <- function(nba){
  stats <- nba[, c("Player","Tm", "Pos", "FG", "FT", "PTS", "MP")] %>%
    mutate(Player = as.factor(Player)) %>%
    mutate(Pos = as.factor(Pos)) %>%
    mutate(Tm = as.factor(Tm)) %>%
    distinct(Player, .keep_all = TRUE) #getting rid of duplicate names
  x <- gsub("[^[:alnum:]]", " ", stats$Player) #replacing special chars with spaces
  mutate(stats, Player = x) #returns the cleaned data
}
#calling the function
nba_stats <- nba_pre_processing(nba)
view(nba_stats)
#pre-processing for salary data
salary <- read_csv("/cloud/project/nba_salaries_21.csv")
#function for pre-processing salary data
salary_pre_processing <- function(salary){
  salary <- salary %>% #assigning the result so the de-duplication is kept
    mutate(Player = as.factor(Player)) %>%
    distinct(Player, .keep_all = TRUE) #getting rid of duplicate names
  x <- gsub("[^[:alnum:]]", " ", salary$Player) #replacing special chars with spaces
  mutate(salary, Player = x) #returns the cleaned data
}
#calling salary function
salary_stats <- salary_pre_processing(salary)
view(salary_stats)
#combining nba data with salary data into one data frame
nba_combined <- merge(nba_stats, salary_stats, by="Player", all=TRUE) #all=TRUE keeps players that appear in only one of the two datasets
#fixing column names (this assumes the salary data adds exactly one column besides Player)
colnames(nba_combined) <- c("Player", "Tm","Pos", "FG", "FT","PTS", "MP", "Salary")
#Creating a 3D graph
fig <- plot_ly(nba_combined, type = "scatter3d", mode = "markers", x = ~MP, y = ~FG, z = ~PTS,
  text = ~paste('Player:', Player, 'Salary:', Salary)) #text shown on hover
fig <- fig %>% layout(title = 'NBA Player Rankings Based on Minutes Played, Field Goals, and Points')
fig <- fig %>% layout(scene = list(xaxis = list(title = 'Minutes Played'),
  yaxis = list(title = 'Field Goals'),
  zaxis = list(title = 'Points')))
fig
Similar Packages
ggplot2
ggplot2 is probably the most similar package to plotly. You can even convert a ggplot2 graph to a plotly graph, and the two packages share many capabilities for creating data visualization graphics. However, plotly has somewhat more capabilities for creating fully interactive graphs. Plotly is also available for Python, MATLAB, and React, whereas ggplot2 is made for R.
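As a minimal sketch of the conversion mentioned above, a ggplot2 object can be passed to plotly's `ggplotly()` function. The example assumes both packages are installed and uses the built-in mtcars dataset; the names `static_plot` and `interactive_plot` are illustrative.

```r
library(ggplot2)
library(plotly)

# An ordinary static ggplot2 scatterplot
static_plot <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point() +
  labs(x = "Miles Per Gallon", y = "Horsepower")

# ggplotly() converts the ggplot2 object into an interactive plotly figure
interactive_plot <- ggplotly(static_plot)

interactive_plot
```

The converted figure gains plotly's hover, zoom, and pan behavior, although not every ggplot2 feature is guaranteed to translate exactly.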
The ggplot2 graph below plots GDP per capita against life expectancy.
leaflet
Leaflet is a very popular R package used to make interactive maps. It expands greatly on the map visualization tools in plotly.
Some features include:
- interactive panning and zooming
- creating maps using combinations of map tiles, map markers, polygons, lines, popups, and more
- create maps directly in RStudio
- embed maps into RMarkdown documents and Shiny apps
- display maps in non-spherical Mercator projections
- easy rendering of spatial objects from the sp or sf packages, or data frames with latitude/longitude columns
- change map features using plugins from the leaflet plugins repository
The leaflet graph below shows an interactive graph of U.S. population density.
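Separately from the population density map, the sketch below illustrates the tile and marker features listed above in their simplest form. It assumes the leaflet package is installed; the coordinates (approximately Montreal) and the popup text are illustrative values, and the name `map_fig` is hypothetical.

```r
library(leaflet)

# Start an empty map, add the default OpenStreetMap tiles,
# then place one marker with a popup
map_fig <- leaflet() %>%
  addTiles() %>%
  addMarkers(lng = -73.5673, lat = 45.5017, popup = "Montreal")

map_fig
```

Printing the object in RStudio or in an RMarkdown document renders a map that can be panned and zoomed.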
dygraphs
dygraphs is an R package that focuses on interactive time series visualization.
Some features include:
- automatic plotting of xts time-series objects or other time series objects that are convertible to xts (eXtensible Time Series)
- range selector interface for interactive panning and zooming
- interactive series/point highlighting with different visual options
- highly configurable axis and series display
- graph overlays including shaded regions, event lines, and annotations
- make graphs in the R console
- embed graphs into RMarkdown documents and R Shiny apps
The dygraph below is an interactive time-series graph showing the discharge of the River Danube.
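For comparison, here is a minimal dygraphs sketch that does not use the Danube data. It assumes the dygraphs package is installed and plots the built-in `mdeaths` and `fdeaths` monthly time series from R's datasets package; the names `lung_deaths` and `ts_fig` are illustrative.

```r
library(dygraphs)

# Combine two built-in monthly time series into one multi-series object
lung_deaths <- cbind(mdeaths, fdeaths)

# Plot both series and add a range selector for panning and zooming
ts_fig <- dygraph(lung_deaths, main = "Monthly Deaths from Lung Diseases (UK)") %>%
  dyRangeSelector()

ts_fig
```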